Compiler Techniques for Reducing Data Cache Miss Rate on a Multithreaded Architecture

Authors

  • Subhradyuti Sarkar
  • Dean M. Tullsen
Abstract

High performance embedded architectures will in some cases combine simple caches and multithreading, two techniques that increase energy efficiency and performance at the same time. However, that combination can produce high and unpredictable cache miss rates, even when the compiler optimizes the data layout of each program for the cache. This paper examines data-cache aware compilation for multithreaded architectures. Data-cache aware compilation finds a layout for data objects which minimizes inter-object conflict misses. This research extends and adapts prior cache-conscious data layout optimizations to the much more difficult environment of multithreaded architectures. Solutions are presented for two computing scenarios: (1) the more general case where any application can be scheduled along with other applications, and (2) the case where the co-scheduled working set is more precisely known.
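
To make the notion of inter-object conflict misses concrete, the following C sketch (an illustration under assumed cache parameters, not the paper's algorithm) places two arrays whose base addresses differ by exactly the cache size. Accessed in an interleaved loop on a direct-mapped cache, they evict each other's lines on every iteration; a single cache-line pad between them, the kind of offset a cache-conscious layout computes, maps them to disjoint sets. The array names, cache geometry, and padding amount are assumptions.

    /* Illustrative sketch only: assumes a 64 KiB direct-mapped data cache with
     * 64-byte lines, and that the linker places the two arrays contiguously in
     * declaration order.  None of these names come from the paper. */
    #include <stdio.h>

    #define LINE_BYTES  64
    #define CACHE_BYTES (64 * 1024)
    #define N           (CACHE_BYTES / sizeof(double))

    static double a[N];
    /* Without padding, &b == (char *)&a + CACHE_BYTES, so a[i] and b[i] always
     * map to the same cache set and the loop below misses on nearly every
     * access.  Uncommenting this one-line pad shifts b into adjacent sets and
     * removes the inter-object conflicts; this is the kind of offset a
     * cache-conscious layout pass would compute automatically. */
    /* static char pad[LINE_BYTES]; */
    static double b[N];

    int main(void)
    {
        double sum = 0.0;
        for (size_t i = 0; i < N; i++)
            sum += a[i] * b[i];          /* interleaved accesses to both objects */
        printf("%f\n", sum);
        return 0;
    }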


Related papers

Optimized Thread Creation for Processor Multithreading

Due to the mismatch in the speed of the processor and the speed of the memory subsystem, modern processors spend a significant portion (often more than 50%) of their execution time stalling on cache misses. Processor multithreading is an approach that can reduce this stall time; however, processor multithreading increases the cache miss rate and demands higher memory bandwidth. In this paper, a n...


Compiler-Assisted Cache Replacement: Problem Formulation and Performance Evaluation

Recent research results show that conventional hardware-only cache solutions result in unsatisfactory cache utilization for both regular and irregular applications. To overcome this problem, a number of architectures introduce instruction hints to assist cache replacement. For example, Intel Itanium architecture augments memory accessing instructions with cache hints to distinguish data that wi...
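
As a rough, compiler-agnostic illustration of such hints (the Itanium completers themselves are assembly-level and are not shown in the snippet), GCC and Clang expose __builtin_prefetch(addr, rw, locality), whose locality argument of 0 marks data with no temporal reuse so it need not displace more useful lines. The streaming kernel below is an assumed example, not taken from the cited paper.

    #include <stddef.h>

    /* 'src' is streamed through exactly once, so its lines have no temporal
     * locality.  Prefetching a few iterations ahead with locality hint 0 asks
     * the hardware to fetch the data without letting it crowd out lines that
     * will be reused.  The run-ahead distance of 8 elements is an arbitrary
     * illustrative choice. */
    void copy_stream(double *dst, const double *src, size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            if (i + 8 < n)
                __builtin_prefetch(&src[i + 8], 0 /* read */, 0 /* no reuse */);
            dst[i] = src[i];
        }
    }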


Compiler Generated Multithreading to Alleviate Memory Latency

Since the era of vector and pipelined computing, computational speed has been limited by memory access time. Faster caches and more cache levels are used to bridge the growing gap between the memory and processor speeds. With the advent of multithreaded processors, it becomes feasible to concurrently fetch data and compute in two cooperating threads. A technique is presented to generate these...
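
A minimal sketch of that idea, written with POSIX threads rather than any particular compiler's generated code: a helper thread runs ahead of the compute thread and touches the data it will need, so that on a multithreaded core the two threads' cache misses overlap with useful work. The run-ahead stride, data size, and thread structure are assumptions for illustration.

    #include <pthread.h>
    #include <stddef.h>
    #include <stdio.h>

    #define N (1 << 20)
    static double data[N];
    static volatile double sink;                 /* keeps the helper's loads alive */

    /* Helper ("fetch") thread: touch one element per 64-byte line so the data
     * is resident in the shared cache by the time the compute thread needs it. */
    static void *prefetch_helper(void *arg)
    {
        (void)arg;
        double s = 0.0;
        for (size_t i = 0; i < N; i += 8)
            s += data[i];
        sink = s;
        return NULL;
    }

    int main(void)
    {
        pthread_t helper;
        pthread_create(&helper, NULL, prefetch_helper, NULL);

        double sum = 0.0;                        /* compute thread: the real work */
        for (size_t i = 0; i < N; i++)
            sum += data[i] * data[i];

        pthread_join(helper, NULL);
        printf("%f\n", sum);
        return 0;
    }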


An Empirical Study on how Program

Cache miss rates are quoted for a specific program, cache configuration, and input set; the effect of program layout on the miss rate has largely been ignored. This paper examines the miss variation, that is, the variation of the miss rate for instruction and data caches resulting from randomly chosen layouts. New layouts were generated by changing the order of the modules on the command line when...


A multithreaded PowerPC processor for commercial servers

This paper describes the microarchitecture of the RS64 IV, a multithreaded PowerPC processor, and its memory system. Because this processor is used only in IBM iSeries and pSeries commercial servers, it is optimized solely for commercial server workloads. Increasing miss rates because of trends in commercial server applications and increasing latency of cache misses because of rapidly increasin...




Publication date: 2008